Semi-supervised Dependency Parsing using Bilexical Contextual Features from Auto-Parsed Data
Authors
Abstract
We present a semi-supervised approach to improving dependency parsing accuracy using bilexical statistics derived from auto-parsed data. The method estimates the attachment potential of head-modifier word pairs, taking into account not only the head and modifier words themselves but also the words surrounding them. When the learned statistics are integrated as features in a graph-based parsing model, we observe accuracy improvements when parsing various English datasets.
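To make the idea of an attachment statistic concrete, here is a minimal sketch in Python. It is not the authors' formulation: the abstract does not give the actual statistic, so the relative-frequency score, the choice of "next word" as the context, and all function names below are assumptions made purely for illustration. The input is assumed to be auto-parsed sentences given as lists of (word, head_index) pairs.

```python
from collections import Counter


def collect_counts(parsed_sentences):
    """Count attachment events in auto-parsed data (illustrative only).

    Each sentence is a list of (word, head_index) pairs, where head_index is
    the 0-based position of the token's head, or -1 for the root.
    """
    pair_counts = Counter()     # (head, modifier) attachments
    context_counts = Counter()  # (head, word after head, modifier, word after modifier)
    cooc_counts = Counter()     # ordered word pairs that co-occur in a sentence
    for sent in parsed_sentences:
        words = [w for w, _ in sent]
        padded = words + ["</s>"]  # sentinel so the last word has a right neighbour
        for m, (mod, h) in enumerate(sent):
            if h < 0:
                continue  # the root has no head
            head = words[h]
            pair_counts[(head, mod)] += 1
            context_counts[(head, padded[h + 1], mod, padded[m + 1])] += 1
        for i, a in enumerate(words):
            for j, b in enumerate(words):
                if i != j:
                    cooc_counts[(a, b)] += 1
    return pair_counts, context_counts, cooc_counts


def attachment_score(head, mod, pair_counts, cooc_counts):
    """Assumed score: share of co-occurrences in which mod attaches to head."""
    seen = cooc_counts[(head, mod)]
    return pair_counts[(head, mod)] / seen if seen else 0.0
```

In a graph-based parser, a score like this would typically be bucketed into a few discrete ranges and added as an extra feature on each candidate head-modifier arc; the contextual counts could be scored and bucketed in the same way.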
Similar resources
Improving Dependency Parsing with Subtrees from Auto-Parsed Data
This paper presents a simple and effective approach to improve dependency parsing by using subtrees from auto-parsed data. First, we use a baseline parser to parse large-scale unannotated data. Then we extract subtrees from dependency parse trees in the auto-parsed data. Finally, we construct new subtree-based features for parsing algorithms. To demonstrate the effectiveness of our proposed app...
Ambiguity-aware Ensemble Training for Semi-supervised Dependency Parsing
This paper proposes a simple yet effective framework for semi-supervised dependency parsing at the entire-tree level, referred to as ambiguity-aware ensemble training. Instead of using only 1-best parse trees as in previous work, our core idea is to use a parse forest (ambiguous labelings) to combine multiple 1-best parse trees generated by diverse parsers on unlabeled data. With a conditional rand...
Experiments on Semi-supervised Dependency Parsing of a Morphologically Rich Language
This paper presents a set of preliminary experiments aimed at improving dependency parsing of Basque using a semi-supervised technique. Our approach makes use of large unannotated corpora (over 140M word forms). We investigate the use of information induced from a large raw corpus as well as from an automatically parsed version. The first results show encouraging improvement...
Semi-Supervised Feature Transformation for Dependency Parsing
In current dependency parsing models, conventional features (i.e. base features) defined over surface words and part-of-speech tags in a relatively high-dimensional feature space may suffer from the data sparseness problem and thus exhibit less discriminative power on unseen data. In this paper, we propose a novel semi-supervised approach to addressing the problem by transforming the base featu...
Working with a small dataset - semi-supervised dependency parsing for Irish
We present a number of semi-supervised parsing experiments on the Irish language, carried out using a small seed set of manually parsed trees and a larger, yet still relatively small, set of unlabelled sentences. We take two popular dependency parsers, one graph-based and one transition-based, and compare results for both. Results show that using semi-supervised learning in the form of self-tra...